Using Text's Terms and Syntactical Properties for Document Similarity
Similar resources
Using Text's Terms and Syntactical Properties for Document Similarity
This paper reports on experiments performed to investigate the use of syntactical structures of sentences combined with sentences' terms for document similarity calculation. The document's sentences were first converted into ordered Part of Speech (POS) tags that were then fed into the Longest Common Subsequence (LCS) algorithm to determine the size and count of the LCSs found when comparing th...
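The comparison step described in this abstract can be illustrated with a minimal sketch. It assumes the sentences have already been converted to POS tag sequences; the tag sequences, the function names `lcs_length` and `lcs_similarity`, and the normalisation by the longer sequence are all illustrative assumptions, not the authors' implementation. The abstract also mentions counting the LCSs found, which this sketch does not do.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of a and b."""
    # prev[j] is the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for tag_a in a:
        curr = [0]
        for j, tag_b in enumerate(b, start=1):
            if tag_a == tag_b:
                # Matching tags extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length divided by the longer sequence (assumed normalisation)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical, hand-written POS tag sequences (Penn Treebank style tags).
sent_1 = ["DT", "NN", "VBZ", "DT", "JJ", "NN"]
sent_2 = ["DT", "JJ", "NN", "VBZ", "NN"]

if __name__ == "__main__":
    print(lcs_length(sent_1, sent_2))
    print(round(lcs_similarity(sent_1, sent_2), 3))
```

Working on tag sequences rather than words means two sentences with different vocabulary but the same grammatical shape still score as similar, which is why the abstract combines this signal with the sentences' terms.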
Document Similarity Amid Automatically Detected Terms
This is the second edition of the task formerly known as Question Answering for the Spoken Web (QASW). It is an information retrieval evaluation in which the goal is to match spoken Gujarati “questions” to spoken Gujarati responses. This paper gives an overview of the task, covering its design and the development of the test collection, along with differences from previous years.
Document Retrieval using Predication Similarity
Document retrieval has been an important research problem over many years in the information retrieval community. State-of-the-art techniques utilize various methods in matching documents to a given document including keywords, phrases, and annotations. In this paper, we propose a new approach for document retrieval that utilizes predications (subject-predicate-object triples) extracted from th...
Document Similarity Judgment for Interactive Document Clustering
This paper investigates the task of document similarity judgment for interactive document clustering. We suppose that one of the promising approaches for developing the next generation of web search engines is to incorporate a user feedback mechanism into constrained clustering. As a basis for designing such search engines, it is important to study the interface design that can reduce the user's burden of givi...
Documents similarity measurement using field association terms
Conventional approaches to text analysis and information retrieval, which measure document similarity by considering all of the information in texts, are relatively inefficient for processing large text collections in heterogeneous subject areas. This paper outlines a new text manipulation system, FA-Sim, that is useful for retrieving information in large heterogeneous texts and for recog...
Journal
Journal title: International Journal of Intelligent Information Systems
Year: 2016
ISSN: 2328-7675
DOI: 10.11648/j.ijiis.20160506.11